But I am the kind of person who likes to get to the bottom of things and understand whatever I care about, so I posted the question in one QQ group after another, and nobody responded. Alas, depressing. I had to Google it and teach myself. The following is a detailed
UTF-8 is one implementation of Unicode, which means its byte structure has specific requirements. When we say that the range of Chinese characters is 0x4E00 to 0x9FA5, we are referring to Unicode values; in UTF-8, each of these characters is encoded as three bytes. So it
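The relationship between a Unicode value and its three UTF-8 bytes can be sketched as follows. This is a minimal illustration, not taken from the original article; the helper name `encode_3byte` is hypothetical, and the result is cross-checked against Python's built-in encoder.

```python
def encode_3byte(code_point: int) -> bytes:
    """Pack a code point in U+0800..U+FFFF into the 3-byte UTF-8 layout
    1110xxxx 10xxxxxx 10xxxxxx (hypothetical helper, for illustration)."""
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("code point does not use the 3-byte form")
    return bytes([
        0xE0 | (code_point >> 12),          # leading byte: top 4 bits
        0x80 | ((code_point >> 6) & 0x3F),  # continuation: middle 6 bits
        0x80 | (code_point & 0x3F),         # continuation: low 6 bits
    ])

FIRST, LAST = 0x4E00, 0x9FA5  # the range quoted in the text

# Every code point in the range takes exactly three bytes in UTF-8.
lengths = {len(chr(cp).encode("utf-8")) for cp in range(FIRST, LAST + 1)}
print(lengths)

# The manual bit packing agrees with the built-in encoder.
print(encode_3byte(FIRST).hex(), chr(FIRST).encode("utf-8").hex())
print(encode_3byte(LAST).hex(), chr(LAST).encode("utf-8").hex())
```

So a test such as "is this character Chinese?" has to be applied to the decoded code point, not to the raw UTF-8 bytes.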
BOM (Byte Order Mark) is a marker that indicates byte order.
There is a character called "ZERO WIDTH NO-BREAK SPACE" in the UCS encoding, and its code is FEFF. FFFE is not a character in UCS, so it should not appear in actual transmission. The UCS
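Because FEFF is a valid character and FFFE is not, the first two bytes of a stream can reveal its byte order. Here is a minimal sketch for the UTF-16 case, using the BOM constants from Python's standard `codecs` module; the function name `detect_utf16_byte_order` is hypothetical.

```python
import codecs

def detect_utf16_byte_order(data: bytes) -> str:
    """Guess the byte order from a leading BOM (hypothetical helper)."""
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "big-endian"
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "little-endian"
    return "unknown"  # no BOM present

# U+FEFF followed by "A", serialized in both byte orders.
big = "\ufeffA".encode("utf-16-be")
little = "\ufeffA".encode("utf-16-le")

print(detect_utf16_byte_order(big))
print(detect_utf16_byte_order(little))
print(detect_utf16_byte_order(b"plain"))
```

This only inspects the first two bytes; data without a BOM needs some other source of information about its byte order.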
Encoding problem: why is the response displayed as GBK when it is UTF-8? Http://parttime.wengege.com/h/login.html
The response reports its encoding as both GBK and UTF-8.
HTTP/1.1 200 OK
Server: nginx/1.4.1
Date: Mon, 09 Jun 2014 15:28:28 GMT
Content-Type:
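The header value is cut off in the excerpt above, so the following sketch uses hypothetical `Content-Type` values purely for illustration. It shows one way to read the `charset` parameter that a client would use to decode the body, via the standard library's `email.message.Message`; the helper name `charset_from_content_type` is an assumption.

```python
from email.message import Message

def charset_from_content_type(header_value: str):
    """Return the lower-cased charset parameter, or None if absent
    (hypothetical helper, not part of any HTTP library)."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()

# Hypothetical header values; the real one is truncated in the source.
print(charset_from_content_type("text/html; charset=GBK"))
print(charset_from_content_type("text/html; charset=utf-8"))
print(charset_from_content_type("text/html"))
```

If the charset declared in the header differs from the one the page itself declares or actually uses, the client may decode the bytes with the wrong table, which is one plausible cause of the symptom described.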
Before starting, this article assumes you can already distinguish between a Unicode code (that is, a code point) and a Unicode encoding implementation. Otherwise, the following will not make sense.
History
We know that the ISO 10646 committee
1. Prerequisites 1. Character: the smallest unit of abstract text. It has no fixed shape (a shape belongs to a glyph in a font) and no value. "A" is a character, and "€" (the symbol of the currency used by Germany, France, and many other European countries) is also
What is the difference between Unicode, UTF-8, and ISO8859-1? Take the two characters "中文" ("Chinese") as an example. Looking them up in the code tables shows that the GB2312 code is "D6D0 CEC4", the Unicode code is "4E2D 6587", and the UTF-8 code is "E4B8AD E69687". Note: These two
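The three code values quoted above belong to the two characters "中文" and can be reproduced with Python's built-in codecs. This is a small sketch added for illustration; the variable names are arbitrary.

```python
text = "中文"

# Unicode code points, one per character.
code_points = " ".join(f"{ord(ch):04X}" for ch in text)

# The same two characters serialized under two different encodings.
gb2312_hex = text.encode("gb2312").hex().upper()
utf8_hex = text.encode("utf-8").hex().upper()

print("Unicode:", code_points)
print("GB2312: ", gb2312_hex)
print("UTF-8:  ", utf8_hex)
```

The code points stay the same regardless of encoding; only the byte sequences differ, two bytes per character in GB2312 and three per character in UTF-8.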
ASCII: The ASCII code is a 7-bit code with an encoding range of 0x00-0x7F. The ASCII character set includes English letters, Arabic numerals, punctuation marks, and other characters. 0x00-0x1F and 0x7F are the 33 control characters. The system that
Chinese character coding knowledge points: ASCII is an encoding designed for English text. It uses 7 bits, so it has 2^7 = 128 characters in total, including 34 control and non-printing characters (such as line feed LF and carriage return CR); the remaining 94 are English
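The ASCII counts can be checked with a few lines of Python. This sketch assumes the usual split of the 7-bit range: control characters at 0x00-0x1F plus 0x7F, the space at 0x20, and printable graphic characters at 0x21-0x7E.

```python
ASCII_SIZE = 2 ** 7  # a 7-bit code has 128 positions

# Control characters: 0x00-0x1F plus DEL (0x7F).
control = [c for c in range(ASCII_SIZE) if c < 0x20 or c == 0x7F]

# Printable graphic characters, excluding the space at 0x20.
graphic = [c for c in range(ASCII_SIZE) if 0x21 <= c <= 0x7E]

print(ASCII_SIZE, len(control), len(graphic))

# Control characters + space + graphic characters fill the whole range.
print(len(control) + 1 + len(graphic) == ASCII_SIZE)
```

Counting the space together with the control characters gives 34 non-printing positions, which is why both 33 and 34 show up in descriptions of ASCII.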
What is the difference between Unicode, UTF-8, and ISO8859-1?
Note: This article is reproduced from Sina Blog to make it easier to consolidate what I have learned. Address: http://blog.sina.com.cn/s/blog_673c81990100t1lc.html
This article mainly